AITopics | stochastic gradient langevin

Collaborating Authors

stochastic gradient langevin

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

cb77649f5d53798edfa0ff40dae46322-Supplemental.pdf

Neural Information Processing SystemsFeb-11-2026, 04:44:22 GMT

Optimization is akeycomponent for training machine learning models and has a strong impact on their generalization. In this paper, we consider a particular optimization method--the stochastic gradient Langevin dynamics (SGLD) algorithm--and investigate the generalization of models trained by SGLD.

artificial intelligence, machine learning, wt 1, (17 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

cb77649f5d53798edfa0ff40dae46322-Paper.pdf

Neural Information Processing SystemsFeb-11-2026, 04:44:18 GMT

artificial intelligence, generalization, machine learning, (17 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Finite-Sample Bounds for Adaptive Inverse Reinforcement Learning using Passive Langevin Dynamics

Snow, Luke, Krishnamurthy, Vikram

arXiv.org Machine LearningSep-27-2023

This paper provides a finite-sample analysis of a passive stochastic gradient Langevin dynamics algorithm (PSGLD) designed to achieve adaptive inverse reinforcement learning (IRL). By passive, we mean that the noisy gradients available to the PSGLD algorithm (inverse learning process) are evaluated at randomly chosen points by an external stochastic gradient algorithm (forward learner) that aims to optimize a cost function. The PSGLD algorithm acts as a randomized sampler to achieve adaptive IRL by reconstructing this cost function nonparametrically from the stationary measure of a Langevin diffusion. Previous work has analyzed the asymptotic performance of this passive algorithm using weak convergence techniques. This paper analyzes the non-asymptotic (finite-sample) performance using a logarithmic-Sobolev inequality and the Otto-Villani Theorem. We obtain finite-sample bounds on the 2-Wasserstein distance between the estimates generated by the PSGLD algorithm and the cost function. Apart from achieving finite-sample guarantees for adaptive IRL, this work extends a line of research in analysis of passive stochastic gradient algorithms to the finite-sample regime for Langevin dynamics.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

arXiv.org Machine Learning

2304.09123

Country:

North America > United States > New York > Tompkins County > Ithaca (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > France > Occitanie > Haute-Garonne > Toulouse (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.90)

Add feedback

On the strong stability of ergodic iterations

Györfi, László, Lovas, Attila, Rásonyi, Miklós

arXiv.org Machine LearningApr-18-2023

We revisit processes generated by iterated random functions driven by a stationary and ergodic sequence. Such a process is called strongly stable if a random initialization exists, for which the process is stationary and ergodic, and for any other initialization, the difference of the two processes converges to zero almost surely. Under some mild conditions on the corresponding recursive map, without any condition on the driving sequence, we show the strong stability of iterations. Several applications are surveyed such as stochastic approximation and queuing. Furthermore, new results are deduced for Langevin-type iterations with dependent noise and for multitype branching processes.

artificial intelligence, iteration, machine learning, (15 more...)

arXiv.org Machine Learning

2304.04657

Country:

North America > United States > New York > Tompkins County > Ithaca (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.34)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Communications > Networks (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.48)

Add feedback

Stochastic Gradient Langevin with Delayed Gradients

Kungurtsev, Vyacheslav, Chatterjee, Bapi, Alistarh, Dan

arXiv.org Machine LearningJun-12-2020

Stochastic Gradient Langevin Dynamics (SGLD) ensures strong guarantees with regards to convergence in measure for sampling log-concave posterior distributions by adding noise to stochastic gradient iterates. Given the size of many practical problems, parallelizing across several asynchronously running processors is a popular strategy for reducing the end-to-end computation time of stochastic optimization algorithms. In this paper, we are the first to investigate the effect of asynchronous computation, in particular, the evaluation of stochastic Langevin gradients at delayed iterates, on the convergence in measure. For this, we exploit recent results modeling Langevin dynamics as solving a convex optimization problem on the space of measures. We show that the rate of convergence in measure is not significantly affected by the error caused by the delayed gradient information used for computation, suggesting significant potential for speedup in wall clock time. We confirm our theoretical results with numerical experiments on some practical problems.

artificial intelligence, machine learning, stochastic gradient langevin, (15 more...)

arXiv.org Machine Learning

2006.07362

Country:

Europe > Austria (0.04)
Europe > Czechia > Prague (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)

Add feedback

Extended Stochastic Gradient MCMC for Large-Scale Bayesian Variable Selection

Song, Qifan, Sun, Yan, Ye, Mao, Liang, Faming

arXiv.org Machine LearningFeb-7-2020

Stochastic gradient Markov chain Monte Carlo (MCMC) algorithms have received much attention in Bayesian computing for big data problems, but they are only applicable to a small class of problems for which the parameter space has a fixed dimension and the log-posterior density is differentiable with respect to the parameters. This paper proposes an extended stochastic gradient MCMC lgoriathm which, by introducing appropriate latent variables, can be applied to more general large-scale Bayesian computing problems, such as those involving dimension jumping and missing data. Numerical studies show that the proposed algorithm is highly scalable and much more efficient than traditional MCMC algorithms. The proposed algorithms have much alleviated the pain of Bayesian methods in big data computing.

algorithm, iteration, stochastic gradient langevin, (10 more...)

arXiv.org Machine Learning

2002.02919

Country:

North America > United States > Indiana > Tippecanoe County > West Lafayette (0.04)
North America > United States > Indiana > Tippecanoe County > Lafayette (0.04)
Europe > United Kingdom (0.04)
(2 more...)

Genre: Research Report (0.82)

Add feedback

Consistency and fluctuations for stochastic gradient Langevin dynamics

Teh, Yee Whye, Thiéry, Alexandre, Vollmer, Sebastian

arXiv.org Machine LearningJun-12-2015

Applying standard Markov chain Monte Carlo (MCMC) algorithms to large data sets is computationally expensive. Both the calculation of the acceptance probability and the creation of informed proposals usually require an iteration through the whole data set. The recently proposed stochastic gradient Langevin dynamics (SGLD) method circumvents this problem by generating proposals which are only based on a subset of the data, by skipping the accept-reject step and by using decreasing step-sizes sequence $(\delta_m)_{m \geq 0}$. %Under appropriate Lyapunov conditions, We provide in this article a rigorous mathematical framework for analysing this algorithm. We prove that, under verifiable assumptions, the algorithm is consistent, satisfies a central limit theorem (CLT) and its asymptotic bias-variance decomposition can be characterized by an explicit functional of the step-sizes sequence $(\delta_m)_{m \geq 0}$. We leverage this analysis to give practical recommendations for the notoriously difficult tuning of this algorithm: it is asymptotically optimal to use a step-size sequence of the type $\delta_m \asymp m^{-1/3}$, leading to an algorithm whose mean squared error (MSE) decreases at rate $\mathcal{O}(m^{-1/3})$

artificial intelligence, bayesian inference, machine learning, (16 more...)

arXiv.org Machine Learning

1409.0578

Genre: Research Report (0.65)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

Add feedback